Parsing spontaneous speech

نویسنده

  • Rodolfo Delmonte
چکیده

In this paper we will present work carried out lately on the 50,000 words Italian Spontaneous Speech Corpus called AVIP, under national project API, made available for free download from the website of the coordinator, the University of Naples. We will concentrate on the tuning of the parser for Italian which had been previously used to parse 100,000 words corpus of written Italian within the National Treebank initiative coordinated by ILC in Pisa. The parser receives as input the adequately transformed orthographic transcription of the dialogues making up the corpus, in which pauses, hesitations and other disfluencies have been turned into most likely corresponding punctiation marks, interjections or truncation of the word underlying the uttered segment. The most interesting phenomenon we will discuss is without any doubts "overlapping", i.e. a speech event in which two people speak at the same time by uttering actual words or in some cases nonwords, when one of the speakers, usually the one which is not the current turntaker, interrupts the current speaker. This phenomenon takes place at a certain point in time where it has to be anchored to the speech signal but in order to be fully parsed and subsequently semantically interpreted, it needs to be referred semantically to a following turn.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a way of syntactic parsing and a natural language that automatically analyzes the dependency structure of sentences, and the input for each sentence creates a dependency graph. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...

متن کامل

Adapting dependency parsing to spontaneous speech for open domain spoken language understanding

Parsing human-human conversations consists in automatically enriching text transcription with semantic structure information. We use in this paper a FrameNet-based approach to semantics that, without needing a full semantic parse of a message, goes further than a simple flat translation of a message into basic concepts. FrameNet-based semantic parsing may follow a syntactic parsing step, howeve...

متن کامل

Spoken Language Understanding in Spoken Dialog System for Travel Information Accessing

This paper briefly introduces the VOTIRS 2.0 system — a Chinese spoken dialog system for travel information accessing. Through this system, users can query the travel information about 52 routes and finally make a transaction for a proper travel plan. The strategy and method to understand spontaneous speech in the system are discussed in detail. To understand spontaneous speech, a Semantic Cons...

متن کامل

Parsing spoken language without syntax

Parsing spontaneous speech is a difficult task because of the ungrammatical nature of most spoken utterances. To overpass this problem, we propose in this paper to handle the spoken language without considering syntax. We describe thus a microsemantic parser which is uniquely based on an associative network of semantic priming. Experimental results on spontaneous speech show that this parser st...

متن کامل

Parsing Spoken Language without Syntax : a Microsemantic Approach

Parsing spontaneous speech is a difficult task because of the ungrammatical nature of most spoken utterances. To overpass this problem, we propose in this paper to handle the spoken language without considering syntax. We describe thus a microsemantic parser which is uniquely based on an associative network of semantic priming. Experimental results on spontaneous speech show that this parser st...

متن کامل

Study on Detection of Prosodic Phrase Boundaries in Spontaneous Speech

Prosodic information, which has the abilities of disambiguation, improving the parsing of the spoken language and predicting recognition errors, becomes more and more important in speech recognition and understanding, especially in spontaneous speech. In this paper, we investigate the detection of the phrase boundaries by prosodic features in the domain-specified Chinese spontaneous speech. The...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003